C.S.M.P. Digest             Mon, 03 Aug 92       Volume 1 : Issue 154

Today's Topics:

        The Bedrock White Paper
        Does the 68000 have a "block move" instruction


The Comp.Sys.Mac.Programmer Digest is moderated by Michael A. Kelly.

The digest is a collection of article threads from the internet newsgroup
comp.sys.mac.programmer. It is designed for people who read c.s.m.p.
semi-regularly and want an archive of the discussions. If you don't know
what a newsgroup is, you probably don't have access to it. Ask your systems
administrator(s) for details. (This means you can't post questions to the
digest.)

Each issue of the digest contains one or more sets of articles (called
threads), with each set corresponding to a 'discussion' of a particular
subject. The articles are not edited; all articles included in this digest
are in their original posted form (as received by our news server at
cs.uoregon.edu). Article threads are not added to the digest until the last
article added to the thread is at least one month old (this is to ensure that
the thread is dead before adding it to the digest). Article threads that
consist of only one message are generally not included in the digest.

The entire digest is available for anonymous ftp from ftp.cs.uoregon.edu
[128.223.8.8] in the directory /pub/mac/csmp-digest. The most recent issues
are available from sumex-aim.stanford.edu [36.44.0.6] in the directory
/info-mac/digest/csmp. If you don't have ftp capability, the sumex archive
has a mail server; send a message with the text '$MACarch help' (no quotes)
to LISTSERV@ricevm1.rice.edu for more information.

The digest is also available via email. Just send a note saying that you
want to be on the digest mailing list to mkelly@cs.uoregon.edu, and you will
automatically receive each new issue as it is created. Sorry, back issues
are not available through the mailing list.

Send administrative mail to mkelly@cs.uoregon.edu.

-------------------------------------------------------

From: k044477@hobbes.kzoo.edu (Jamie R. McCarthy)
Subject: The Bedrock White Paper
Organization: Kalamazoo College
Date: Wed, 24 Jun 1992 16:12:48 GMT

The Bedrock (tm) Cross-Platform Application Framework
A White Paper

I. Introduction

The Bedrock framework is a cross-platform application framework technology.
It assists application developers in (1) creating applications of high
quality and reliability, (2) quickly delivering these applications across
industry standard platforms, and (3) localizing these applications to meet
the specific requirements of each country.

The Bedrock framework accomplishes these objectives by taking advantage of
object-oriented technology to provide the developer a reusable components
approach to building applications. Components supplied by the Bedrock
Framework can be used to build sophisticated Graphical User Interface (GUI)
based applications that are tuned to the specific platform GUI. In addition,
the developer may create new components which can be used and shared among
multiple development efforts. These components, built on top of the Bedrock
framework, are portable across the same platforms as the Bedrock framework.
They provide the developer a leveraged productivity tool for use in building
and maintaining new applications.

II. Background

What is an Application Framework?

Building GUI based applications on today's platforms is a difficult chore.
The small task of placing a window on the screen requires the application
developer to deal with, and coordinate, a large number of system services:
the window manager, the event manager, the memory manager, etc. The
application programming interfaces (APIs) to these systems are low level,
meaning that it takes a large effort to accomplish what seems to be a simple
task. Changing the pre-defined behavior of the standard services for an
application is difficult.
And the problem is further complicated when trying to support multiple
platforms.

The goal of an application framework is to provide the developer a
"structure" with functionality common to most applications. The developer
then adds to this "structure" the specific functionality required to
complete the application. Creating new types of applications is an
incremental effort.

The initial value of an application framework is in the head start it
provides. It automatically provides the structure for the application to
create the application desktop, provide default menus, and handle the
interactions between the application and the operating system. Having this
initial structure available makes it easier to create the "Hello World" in a
window example. Bedrock requires 30 lines of code; the Windows API requires
over 100.

Yet, this is not where the real value of an application framework is
delivered. Because of the overhead of the application framework structure,
"Hello World" done using an application framework may be easier than using
an API, but is more costly in terms of size of the application. However, the
"Hello World" applications are not the ones that developers are building.
They are building applications with interactive graphical user interfaces.
It is for these kinds of applications that a developer uses an application
framework.

The framework assists the developer by providing a way of writing
applications by extending the structure. Instead of having to recreate the
structure necessary for each new feature, an application framework allows
the developer to reuse the provided structure, and extend it to implement
the new feature. As applications grow larger, this ability to extend
functionality by incremental effort lessens the complexity of the
application. This makes it easier to develop large-scale reliable
applications.

MacApp, Think C Library (TCL) and ObjectWindowLibrary are all examples of
application frameworks.
They are used by thousands of developers today. Photoshop from Adobe and
SoundEdit Pro from MacroMedia were developed using MacApp. Daymaker from
Pastel Development and 1-2-3 from Lotus were developed using TCL.

What is a Cross-Platform Application Framework?

Application frameworks such as Apple's MacApp for the Macintosh are targeted
at specific platforms. Applications written in MacApp will not work on
Windows, and applications written in a Windows application framework will
not work on the Macintosh.

Bedrock uses object-oriented technology to insulate the developer from the
specifics of the platform specific programming interface. The developer
works with high-level abstract components such as Documents, Panes and
Menus. The developer may have access to even higher level components such as
a drawing tool or text editing engine. The Bedrock framework then maps those
abstract objects to the platform specific functionality, thus ensuring
cross-platform performance.

The developer benefits from a cross-platform application framework in these
significant ways: (1) Less effort to deliver and maintain applications
across multiple platforms, (2) Reduced risk of choosing the wrong platform,
and (3) Providing applications tuned to the platform GUI.

What other support should an Application Framework provide?

The United States represents less than 50% of the global software market.
Companies need to deliver applications globally, and commercial developers
need to be able to easily provide common applications in many languages. In
order to accomplish this, an application framework needs complete support
for software internationalization. This should include support for different
character sets, understanding of country and language specific formatting
for date, time and currency, and proper sorting algorithms.

III. The Bedrock Foundation

The Bedrock Framework

The Bedrock framework is a cross-platform application framework technology
implemented in C++.
Specific implementations are being developed today for the Macintosh and
Windows platforms, with others to follow.

There are three major elements to the Bedrock Framework: (1) the Bedrock
Class Library, (2) Bedrock Resource Information, and (3) the Bedrock Utility
Manager. The Bedrock Class Library provides the developer with components to
build an application. Bedrock Resource Information provides the developer a
platform independent description of resource information, such as view
positioning, dialogs, and screen for the application. This resource
information is portable across platforms. The Bedrock Utility Manager
provides the application developer platform independent access to platform
specific services such as memory management, file management, international
information, and time and date support.

The Bedrock Class Library

The Bedrock Class Library is implemented in the industry standard C++. The
present implementation of the Bedrock Class Library consists of over 150
classes, or types of components that can be used as building blocks by the
application developer.

The Bedrock framework is being implemented using the class library approach,
instead of creating a flat API, to provide the developer more flexibility.
Due to the power of C++, the application developer can create new components
by refining an available component, or combining a number of available
components.

For example, a ValidatedField is a component which checks input before
returning it to an application. A NumberField is a refinement of
ValidatedField. A NumberField only accepts numeric characters as input. A
DateField only accepts valid dates as input. An application developer could
create another refinement of a ValidatedField for inputting a specific
format of accounts within a company. Once this new component is developed it
can be used and reused and shared among developers.
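
The refinement idea in the ValidatedField example can be sketched in a few
lines. This is only an illustrative Python sketch and not Bedrock's C++ API:
the method names is_valid and accept, the YYYY-MM-DD date format, and the
AccountField pattern are all assumptions invented for the example.

```python
import re
from datetime import datetime


class ValidatedField:
    """Base component: checks input before handing it to the application."""

    def is_valid(self, text: str) -> bool:
        # The base class accepts anything; refinements override this hook.
        return True

    def accept(self, text: str) -> str:
        if not self.is_valid(text):
            raise ValueError(f"rejected input: {text!r}")
        return text


class NumberField(ValidatedField):
    """Refinement that accepts only numeric characters."""

    def is_valid(self, text: str) -> bool:
        return text.isascii() and text.isdigit()


class DateField(ValidatedField):
    """Refinement that accepts only valid dates (YYYY-MM-DD assumed here)."""

    def is_valid(self, text: str) -> bool:
        try:
            datetime.strptime(text, "%Y-%m-%d")
            return True
        except ValueError:
            return False


class AccountField(ValidatedField):
    """Hypothetical company-specific refinement: two digits, dash, four digits."""

    _pattern = re.compile(r"\d{2}-\d{4}")

    def is_valid(self, text: str) -> bool:
        return self._pattern.fullmatch(text) is not None
```

Each refinement changes only the validity check and inherits accept()
unchanged, which is the kind of incremental reuse the paragraph above
describes.
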
The Bedrock Class Library supplies the developer a robust set of components
for working with data structures, drawing objects, graphic tools, controls,
windows, and streams. The following are example components in each area:

Functional Area       Example Components
* Collections         Arrays, Matrices, Sets, ...
* Draw objects        Rectangle, Ellipses, Polygons, ...
* Graphics Tools      Fonts, Pens, Wallpaper, ...
* Controls            CheckBoxes, RadioButtons, ListBoxes, ...
* Windows             Documents, Dialog Boxes, ClipboardWindows, ...
* Streams             Files, MemoryStreams, TextStreams, ...

The Bedrock Class Library provides an abstraction of the platform-specific
systems functions. This abstraction frees the developer from worrying about
the specifics of each platform. The developer works with a Bedrock class,
such as SelectFileDialog. The class library handles the proper mapping of
the capability to the platform. The Bedrock Class Library includes a set of
components that handle a "chain of command" event processing in a unified
way, regardless of the underlying platform. Direct calls to the programming
interface are rarely necessary. This level of abstraction provided by the
class library ensures that the developer will not have to take a "least
common denominator" approach.

Bedrock Resource Information

Bedrock Resource Information provides a powerful definition language for
describing menus, dialogs, strings, and accelerator tables. Any visual
component may be loaded from a resource. The language also describes custom
visual information defined by the application developer. The advantage of
separating the resource information from the application code is that the
application's visual components can be modified without access to the source
code of the application. Thus changes to the application's visual components
can be made without having to recompile the application, saving time and
effort.

Bedrock Resource Information provides full scripting support and templates
for data description.
The compiler for the information includes a full ANSI C preprocessor with
optional C++ extensions. This allows the application developer to create and
use macros, and define conditional compilation. The information format
supports and extends Apple's Rez, including type definitions. Types can be
defined by extension, and types can be defined recursively. Bedrock Resource
Information provides support for the entire visual aspect of user interface
in a platform independent way.

The Bedrock Utility Manager

The Bedrock Utility Manager supplies a set of platform utilities in a
platform independent manner:

Virtual Memory Manager - allocates and manages memory, provides compaction,
relocation and page swapping capabilities.

File Manager - provides access to the file system including network access
and name handling from partial specifications.

String Manager - provides standard string functions, connected to the
international manager for country-specific processing. Provides conversion
routines.

Date/Time Manager - provides standard date and time manipulation, tied to
the international manager for country specific processing.

International Manager - provides complete, specific, country, language and
code page information. This includes formatting information for numbers,
currency, date and time. Provides support for single-, double-, and
multiple-byte character sets.

Validation Manager - provides string validation for numbers, date, time and
keywords and range and series validation for numbers.

IV. Future Directions

The Bedrock framework is presently being used for internal application
development at Symantec. In addition, Symantec is making early versions of
the Bedrock framework available to corporate developers and Independent
Software Vendors. The information being learned from these efforts is
helping Symantec to understand the market requirements and enabling a
refinement of the technology.
In addition, we have announced a development and marketing agreement with
Apple Computer, Inc. Apple will contribute engineering resources and the
Bedrock framework will leverage the Apple MacApp object-oriented framework
technology. We are also actively seeking other partnerships that will
benefit the developer using the Bedrock framework. The results of these
partnerships will be reflected in the quality and the robustness of the
Bedrock framework and will provide the developer a standard cross-platform
application framework for all major desktop platforms.

Developer Support: Developer Services: Tools & Apps: Cross-Platform Framework
6-23-92

- --
Jamie McCarthy    Internet: k044477@kzoo.edu    AppleLink: j.mccarthy
Never piss off a computer.

---------------------------

From: ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson)
Subject: Does the 68000 have a "block move" instruction
Organization: University of Illinois at Urbana-Champaign
Date: Mon, 22 Jun 1992 18:27:47 GMT

I have a 68000 manual, but I have been unable to find a "mass move"
function. By that, I mean given a starting address, a destination
address, and the number of bytes, the 68000 would busily crank away at
mass moving bytes. I would have thought that the 68000 had an
instruction like this, but it appears that it does not. Could anyone
confirm this?

- --
Eric Johnson               | "The American Republic will endure until the day
ejohnson@suna0.cs.uiuc.edu | Congress discovers that it can bribe the public
eej37047@uxa.cso.uiuc.edu  | with the public's money" - Alexis de Toucqueville

+++++++++++++++++++++++++++

From: philip@pescadero.Stanford.EDU (Philip Machanick)
Date: 22 Jun 92 19:15:47 GMT
Organization: Stanford University

In article <1992Jun22.182747.27832@sunb10.cs.uiuc.edu>,
ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
|> I have a 68000 manual, but I have been unable to find a "mass move"
|> function.
|> By that, I mean given a starting address, a destination
|> address, and the number of bytes, the 68000 would busily crank away at
|> mass moving bytes. I would have thought that the 68000 had an
|> instruction like this, but it appears that it does not. Could anyone
|> confirm this?

On a quick check I couldn't find one either but no need to feel
something important is missing.

Putting this sort of functionality in hardware is a typical CISC
design flaw that makes it horrendous to implement pipelines
efficiently. The RISC design trade off is to implement the things that
happen often in hardware. The kind of move you want is relatively rarely
used for moving anything more than 4 bytes but because it takes a variable
size of operand, it is hard to pipeline, and introduces various other
nightmares like handling page faults in the middle of an instruction. All
the effort that goes into dealing with this stuff is much better put into
speeding up things that happen often. [Not just my opinion : see a good
architecture book like Hennessy and Patterson.]

In short: I hope you are right because the absence of such an instruction
means that much more chance that the rumoured 68060 will be out on time and
have decent performance.
- --
Philip Machanick
philip@pescadero.stanford.edu

+++++++++++++++++++++++++++

From: Mark.R.Valence@dartmouth.edu (Mark R. Valence)
Date: 22 Jun 92 22:34:05 GMT
Organization: Dartmouth College, Hanover, NH

Well, there are three answers to this question.

The first (correct/literal) answer is "no, it does not". See Philip
Machanick's response for a good reason why not (although the problems
summarized are surmountable).

The second (Mac specific) answer is "but of course". Here is the code that
does a block move:

        MOVEA.L <source-addr>, A0
        MOVEA.L <dest-addr>, A1
        MOVE.L  <length-in-bytes>, D0
        BMOV

What, you don't have the BMOV mnemonic in your assembler? Try opcode $A02E,
commonly known as _BlockMove.
The third answer (neat hack/historical relevance) is to use the MOVEM
instruction to quickly move small chunks of data. This scheme is used in
MacPaint to get quick bitmap movement. Pretty neat.

Mark.

+++++++++++++++++++++++++++

From: cramer@unixland.natick.ma.us (Bill Cramer)
Date: 23 Jun 92 00:47:28 GMT
Organization: Unixland Public Access Unix (508) 655-3848

ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
: I have a 68000 manual, but I have been unable to find a "mass move"
: function. By that, I mean given a starting address, a destination
: address, and the number of bytes, the 68000 would busily crank away at
: mass moving bytes. I would have thought that the 68000 had an
: instruction like this, but it appears that it does not. Could anyone
: confirm this?

Yes there is no block move command:-) The simplest way to perform
a block move is with something like:

        movea.l source,a0
        movea.l dest,a1
        move.l  count,d0
label:  move.b  (a0)+,(a1)+
        dbne    d0,label

which is more or less a strncpy()-type function (excuse the syntax
errors which are no doubt hidden in this code:-). As someone else
noted, on a 68020 or later processor, these instructions stay in
the instruction cache, so it gives you the same (or better) speed
as a block move instruction without the microcode penalty.

Bill Cramer
251 West Central Street, Suite 142     | "You can buy better,
Natick, MA 01760 USA                   |  but you just can't pay more."
Internet: cramer@unixland.natick.ma.us |
CIS: 70322,3412                        |

+++++++++++++++++++++++++++

From: zobkiw@world.std.com (Joe Zobkiw)
Date: 23 Jun 92 01:34:11 GMT
Organization: The World Public Access UNIX, Brookline, MA

In article <1992Jun22.182747.27832@sunb10.cs.uiuc.edu>
ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
>I have a 68000 manual, but I have been unable to find a "mass move"
>function. By that, I mean given a starting address, a destination
>address, and the number of bytes, the 68000 would busily crank away at
>mass moving bytes.
>I would have thought that the 68000 had an
>instruction like this, but it appears that it does not. Could anyone
>confirm this?
>

Even the Mac's _BlockMove function simply does a loop of move.l
instructions. That's it!

- --
- -- joe zobkiw                  Internet: zobkiw@world.std.com
- --                             AOL: AFL Zobkiw
- -- mac.synthesis.MIDI.THINK C.OOP
- -- asm.comm.networks.cool tunes...

+++++++++++++++++++++++++++

From: rla20@duts.ccc.amdahl.com (Roger Allen)
Date: 23 Jun 92 07:49:31 GMT
Organization: Amdahl Corporation, Sunnyvale CA

In article <1992Jun22.191547.14537@CSD-NewsHost.Stanford.EDU>
philip@pescadero.stanford.edu writes:
>In article <1992Jun22.182747.27832@sunb10.cs.uiuc.edu>,
>ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
>|> I have a 68000 manual, but I have been unable to find a "mass move"
>|> function. By that, I mean given a starting address, a destination
>|> address, and the number of bytes, the 68000 would busily crank away at
>|> mass moving bytes. I would have thought that the 68000 had an
>|> instruction like this, but it appears that it does not. Could anyone
>|> confirm this?
>
>On a quick check I couldn't find one either but no need to feel
>something important is missing.
>
>Putting this sort of functionality in hardware is a typical CISC
>design flaw that makes it horrendous to implement pipelines
>efficiently. The RISC design trade off...
>
> [blah, blah, blah]
>
>--
>Philip Machanick
>philip@pescadero.stanford.edu

Ex-squeeze me, did he ask you for a RISC/CISC comparison/contrast? :^)

The answer (I have the '030 book right here) is no.

Roger
- --
> Roger Allen                 | All the opinions expressed are my <
> Amdahl Computer Development | own and have nothing to do with   <
> rla20@cd.amdahl.com         | Amdahl Corporation.
<

+++++++++++++++++++++++++++

From: neeri@iis.ethz.ch (Matthias Neeracher)
Organization: Integrated Systems Laboratory, ETH, Zurich
Date: Tue, 23 Jun 1992 11:31:36 GMT

In article <1992Jun23.004728.1707@unixland.natick.ma.us>
cramer@unixland.natick.ma.us (Bill Cramer) writes:
>ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
>: I have a 68000 manual, but I have been unable to find a "mass move"
>: function. By that, I mean given a starting address, a destination
>: address, and the number of bytes, the 68000 would busily crank away at
>: mass moving bytes. I would have thought that the 68000 had an
>: instruction like this, but it appears that it does not. Could anyone
>: confirm this?
>
>Yes there is no block move command:-) The simplest way to perform
>a block move is with something like:
>
>        movea.l source,a0
>        movea.l dest,a1
>        move.l  count,d0
>label:  move.b  (a0)+,(a1)+
>        dbne    d0,label
>
>which is more or less a strncpy()-type function (excuse the syntax
>errors which are no doubt hidden in this code:-).

No syntax errors, as far as I see, but semantic ones: Your loop will
terminate when one of the following three conditions is met:

1) One of the bytes moved is a zero (This is what dbne tests).
2) (count MOD 65536)+1 bytes were moved, if count is greater than 65535
3) count+1 bytes were moved, if count is <= 65535

This seems to me a rather conclusive proof that this code will *in no case*
perform the intended thing :-)

I agree with a previous poster that the following code is a good idea to
use:

        LEA     source, A0
        LEA     dest, A1
        MOVE.L  count, D0
        _BlockMove

Not only is this code more likely to be correct and easier to rectify if
it's wrong (which I don't want to exclude :-), it is also *faster* for
substantial moves, since _BlockMove uses very carefully unrolled loops
(Duff's device and all that), handles 32 bit counts, and deals correctly
with overlapping source and destination areas.
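
The counting behaviour in conditions 2 and 3 can be checked with a small
model. What follows is a hedged Python sketch of the two-instruction loop,
not 68000 code: the name dbcc_copy and its exit_cond parameter are
hypothetical, and the model assumes the Motorola definition of DBcc (test
the condition first and fall through if it is true; otherwise decrement the
low 16 bits of Dn and branch unless the result is -1). One detail worth
checking against the manual: because DBcc leaves the loop when its condition
is *true*, the NE form exits on the first nonzero byte moved, and a copy
meant to stop at a zero byte would use DBEQ instead.

```python
def dbcc_copy(src, count, exit_cond=None):
    """Model `label: move.b (a0)+,(a1)+ / DBcc d0,label`; return the bytes moved.

    exit_cond(byte) stands in for the condition code tested after the MOVE;
    None models DBRA/DBF, whose condition is never true.
    """
    d0 = count & 0xFFFFFFFF                    # MOVE.L loads all 32 bits of the count
    dest = []
    i = 0
    while True:
        byte = src[i]
        dest.append(byte)                      # move.b (a0)+,(a1)+
        i += 1
        if exit_cond is not None and exit_cond(byte):
            break                              # condition true: fall out of the loop
        low = ((d0 & 0xFFFF) - 1) & 0xFFFF     # DBcc decrements only the low word
        d0 = (d0 & 0xFFFF0000) | low
        if low == 0xFFFF:                      # counter reached -1: loop ends
            break
    return bytes(dest)


zeros = bytes(100)
print(len(dbcc_copy(zeros, 3)))        # count+1 bytes for a small count
print(len(dbcc_copy(zeros, 0x10003)))  # only the low word of a large count matters
```

Under these assumptions a count of 3 and a count of 65536+3 both move four
bytes, which is why a loop of this shape wants count-1 in a 16-bit counter,
or an outer loop for longer blocks.
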
I think the 68030 versions even know how to take advantage of 16 bytes cache
line alignments.

The occasions for which hand coded loops might be appropriate are
- - If the number of bytes to be moved is usually small
- - If source and destination are non-overlapping
- - If the number of bytes to be moved is constant (copying a
    ParamBlockRec)
- - If there are conditions for premature termination of the loop
    (strncpy(), for instance).

>As someone else
>noted, on a 68020 or later processor, these instructions stay in
>the instruction cache, so it gives you the same (or better) speed
>as a block move instruction without the microcode penalty.

While it is true that you would have to pay a performance penalty by having
a blockmove instruction, I'd like to point out that, as far as I know, the
680X0 family *is* microcoded, and the instruction cache doesn't cache
microcode.

Please don't take anything of what I wrote as a flame. We are, after all,
professionals :-)

Matthias

- -----
Matthias Neeracher                                   neeri@iis.ethz.ch
   "You must have picked up that copy of Scarlett instead of Inside Mac
    when you tried to find the right call..." -- Keith Rollin

+++++++++++++++++++++++++++

From: Bruce.Hoult@bbs.actrix.gen.nz
Organization: Actrix Information Exchange
Date: Tue, 23 Jun 1992 17:02:09 GMT

In article <1992Jun22.182747.27832@sunb10.cs.uiuc.edu>
ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
> I have a 68000 manual, but I have been unable to find a "mass move"
> function. By that, I mean given a starting address, a destination
> address, and the number of bytes, the 68000 would busily crank away at
> mass moving bytes. I would have thought that the 68000 had an
> instruction like this, but it appears that it does not. Could anyone
> confirm this?

It's a little bit of RISC philosophy in a CISC machine -- it only takes two
instructions to write a block move...
        move.l  (a0)+,(a1)+     # or .b or .w or choose other registers
        dbra    d0, *-4         # or use another register

...and on anything with an instruction cache ('020 and up) or the "loop
mode" in the '010 it will copy memory about as fast as is possible.
- --
Bruce.Hoult@bbs.actrix.gen.nz   Twisted pair: +64 4 477 2116
BIX: brucehoult                 Last Resort:  PO Box 4145 Wellington, NZ
"Cray's producing a 200 MIPS personal computer with 64MB RAM and a 1 GB
hard disk that fits in your pocket!"   "Great! Is it PC compatable?"

+++++++++++++++++++++++++++

From: stu5s11@bcrka280.bnr.ca
Organization: Bell-Northern Research, Ottawa, Canada
Date: Tue, 23 Jun 1992 17:45:11 GMT

Like the previous messages say, there is no block move
on the 68000. The closest thing that there would be is
the use of DBcc loop on a 68010, which uses a small
instruction cache (3 bytes) to keep the entire loop on
chip, allow a fast blockmove.

The only other instruction close to a block move is on the
68040, that has a MOVE16 instruction that takes advantage
of a burst read to transfer 16 bytes at once.

The reason for not having a block move instruction has also
already been explained. It's not worth it to use up extra
hardware for no increase in speed.

For example, once during a 6502 vs. Z80 debate, (Ah..
the good old days before 040 vs. 486 debates) someone
mentioned the block move instruction on the Z80.
I then proceeded to write a 6502 routine that used less
cycles than the Z80 hardware instruction.

- ----------------------------------------------------------
John Andrusiak

+++++++++++++++++++++++++++

From: nagle@netcom.com (John Nagle)
Date: Tue, 23 Jun 92 17:15:20 GMT
Organization: Netcom - Online Communication Services (408 241-9760 guest)

ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
>I have a 68000 manual, but I have been unable to find a "mass move"
>function. By that, I mean given a starting address, a destination
>address, and the number of bytes, the 68000 would busily crank away at
>mass moving bytes.
>I would have thought that the 68000 had an
>instruction like this, but it appears that it does not. Could anyone
>confirm this?

Most if not all 680x0 implementations provide "loop mode". This mechanism
treats two-instruction loops where the loop instruction is a DBcc as a
special case, so that the instructions remain in the CPU without being
refetched from memory as the loop executes. Thus, all the memory bandwidth
can be devoted to the instruction being repeatedly executed. So a MOVE
followed by a DBcc will produce the effect of a "mass move" instruction.
Still, loop unrolling may be faster.

 I have yet to see a 68000 compiler smart enough to generate a two
instruction loop, though, even though it is quite possible with current
compiler technology.

John Nagle

+++++++++++++++++++++++++++

From: suitti@ima.isc.com (Stephen Uitti)
Organization: Interactive Systems, Cambridge, MA 02138-5302
Date: Tue, 23 Jun 1992 23:14:36 GMT

In article <1992Jun23.174511.14555@bcrka451.bnr.ca>
stu5s11@bcrka280.bnr.ca writes:
>
>Like the previous messages say, there is no block move
>on the 68000. The closest thing that there would be is
>the use of DBcc loop on a 68010, which uses a small
>instruction cache (3 bytes) to keep the entire loop on
>chip, allow a fast blockmove.
>
>The only other instruction close to a block move is on the
>68040, that has a MOVE16 instruction that takes advantage
>of a burst read to transfer 16 bytes at once.
>
>The reason for not having a block move instruction has also
>already been explained. It's not worth it to use up extra
>hardware for no increase in speed.
>
>For example, once during a 6502 vs. Z80 debate, (Ah..
>the good old days before 040 vs. 486 debates) someone
>mentioned the block move instruction on the Z80.
>I then proceeded to write a 6502 routine that used less
>cycles than the Z80 hardware instruction.

In the days of 6502 vs Z80, there were fewer devices on the chip than these
days.
Most (but not all) cycles had to do with loading data from memory or storing
it to memory. Two byte long instructions generally took twice as long as one
byte instructions. Each memory reference takes another cycle per byte.

The Z80 block move instruction is two bytes. It loads a byte via one
pointer, stores it via another, increments both pointers, decrements a
count, and checks the count for zero. If the count is not zero, the Z80
decremented the PC by 2 (which effectively means that the instruction was
really a tight loop). Thus, the instruction was read for each byte
transferred. I don't recall if the 6502 had two byte loads and stores to
speed things up, but loop unrolling can certainly help.

Also, Z80 cycles do not compare one-to-one with 6502 cycles. For a 2 MHz
Z80, you needed 1 MHz memory. For a 1 MHz 6502, you needed 1 MHz memory.
Thus, there was roughly a two to one difference in cycle counts. I remember
a claim that with the 8085 undocumented extension of the extra status flag
conditional jump instructions, you could outperform a 4 MHz Z80 with a 5 MHz
8085. This was sort of fair, in that you could not yet buy a 5 MHz Z80.

I don't believe that the 6502 outperformed a Z80 on block move when using
memory of the same speed. The Z80 instruction could have been implemented in
fewer cycles - who knows, maybe the Z100 did that.

I found that most block copies were short on the Z80, and so performance
wasn't really an issue. Maybe it was just for my code, for my applications,
under my operating system. I seldom worked with more than a byte at a time.

Of course, this is architectural trivia that has nothing to do with a Mac.
_BlockMove is hard to beat for portability and speed on a Mac.

Stephen.
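
The point that the instruction "was read for each byte transferred" can be
made concrete by tallying memory accesses. The following is a rough Python
sketch under the description given above (a two-byte instruction that is
re-fetched on every pass); the helper name ldir_bus_traffic is hypothetical,
and it counts accesses only, not Z80 clock cycles.

```python
LDIR_OPCODE_BYTES = 2  # the block move instruction is two bytes long


def ldir_bus_traffic(byte_count):
    """Tally memory accesses for a block move done by a repeating two-byte instruction."""
    traffic = {"opcode_fetches": 0, "data_reads": 0, "data_writes": 0}
    remaining = byte_count
    while remaining > 0:
        traffic["opcode_fetches"] += LDIR_OPCODE_BYTES  # instruction re-read every pass
        traffic["data_reads"] += 1                      # load via the source pointer
        traffic["data_writes"] += 1                     # store via the destination pointer
        remaining -= 1                                  # count decremented, tested for zero
    return traffic


print(ldir_bus_traffic(100))
```

In this tally half of all memory accesses are opcode fetches, which is the
overhead an unrolled software loop tries to reduce.
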
suitti@ima.isc.com

+++++++++++++++++++++++++++

From: suitti@ima.isc.com (Stephen Uitti)
Organization: Interactive Systems, Cambridge, MA 02138-5302
Date: Tue, 23 Jun 1992 23:20:49 GMT

In article <s4qlwtf.nagle@netcom.com> nagle@netcom.com (John Nagle) writes:
> I have yet to see a 68000 compiler smart enough to generate a two
>instruction loop, though, even though it is quite possible with current
>compiler technology.

I was very surprised not to see DBxx instructions from Think C. This is a
compiler that produces very good code, in general. However, when I actually
attempted to use DBxx using ASM statements, I found it to have all sorts of
unusual side effects. I was unable to determine what they were from the
manual, or how to code with them by hand. Very, very odd. I didn't spend
THAT much time working on it, however.

Stephen.

+++++++++++++++++++++++++++

From: lsh1@ra.msstate.edu (Shane Hebert)
Date: 24 Jun 92 00:31:34 GMT
Organization: Mississippi State University

suitti@ima.isc.com (Stephen Uitti) writes:

>In article <s4qlwtf.nagle@netcom.com> nagle@netcom.com (John Nagle) writes:
>> I have yet to see a 68000 compiler smart enough to generate a two
>>instruction loop, though, even though it is quite possible with current
>>compiler technology.

I may have gotten in a little late on this conversation but using the
movem.l instruction on the 68000 can move 16 longwords with one instruction.

        movem.l SOURCE,d0-d7/a0-a7
        movem.l d0-d7/a0-a7,DEST

works fairly well if you don't mind the destruction of the register
contents. Even then you can use this instruction to save all the registers
before moving the block of memory.
hebert@erc.msstate.edu

+++++++++++++++++++++++++++

From: potts@itl.itd.umich.edu (Paul Potts)
Date: 24 Jun 92 15:00:24 GMT
Organization: Instructional Technology Laboratory, University of Michigan

In article <1992Jun22.191547.14537@CSD-NewsHost.Stanford.EDU> philip@pescadero.stanford.edu writes:
>In article <1992Jun22.182747.27832@sunb10.cs.uiuc.edu>, ejohnson@sunb3.cs.uiuc.edu (Eric E Johnson) writes:
>|> I have a 68000 manual, but I have been unable to find a "mass move"
>|> function. By that, I mean given a starting address, a destination
>|> address, and the number of bytes, the 68000 would busily crank away at
>|> mass moving bytes. I would have thought that the 68000 had an
>|> instruction like this, but it appears that it does not. Could anyone
>|> confirm this?
>

You're right, there is no "block move" instruction. There are good
reasons not to implement it that have been mentioned. These
instructions are nice to have for some purposes - I remember using the
block move on the Z-80 processor, with much success - but they don't
have a place in a processor which must respond quickly to interrupts.
Instructions should generally always be "atomic" and uninterruptible
(for example, it is critical that there is a test and set instruction
which executes atomically in order to implement semaphores.) A single
instruction shouldn't be interruptible (able?) because you would need
excessive logic for state recovery. If the user wants to block-move
ten thousand bytes using a single instruction, you have effectively
locked out interrupt handling for a lot of cycles, which has a great
potential to cause very subtle error conditions (loss of network
packets, serial line characters, "heartbeat" functionality, etc.)

--
The essence of OOP: "With all this horse manure, I know there's got to be
a pony in here somewhere!"
Paul R. Potts, Software Designer --- potts@itl.itd.umich.edu <--- me!
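
For readers who have not used the Z-80 block move mentioned above, it
can be pictured as a one-instruction loop. The C sketch below is only
a toy model built from the descriptions given in this thread (load via
one pointer, store via another, bump both pointers, decrement the
count, back the PC up by 2 if the count is not zero). The type
Z80Model and the functions ldir_step and run_ldir are made-up names
for illustration; they are assumptions, not part of any real emulator
or toolkit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model of the Z80 state that LDIR touches. */
typedef struct {
    uint16_t pc;          /* program counter                */
    uint16_t hl;          /* source pointer                 */
    uint16_t de;          /* destination pointer            */
    uint16_t bc;          /* byte count                     */
    uint8_t  mem[65536];  /* flat 64K address space         */
} Z80Model;

/* One execution of the two-byte LDIR instruction: move one byte,
 * then rewind the PC by 2 if the count has not reached zero. */
static void ldir_step(Z80Model *z)
{
    assert(z != NULL);
    z->pc += 2;                      /* the two opcode bytes are fetched */
    z->mem[z->de] = z->mem[z->hl];   /* load via HL, store via DE        */
    z->hl++;
    z->de++;
    z->bc--;
    if (z->bc != 0)
        z->pc -= 2;                  /* same instruction runs again      */
}

/* Repeat LDIR until the PC moves past it. The return value counts the
 * instruction boundaries reached, i.e. the points at which a pending
 * interrupt could be accepted in this model. */
static unsigned run_ldir(Z80Model *z)
{
    uint16_t start = z->pc;
    unsigned boundaries = 0;

    do {
        ldir_step(z);
        boundaries++;
    } while (z->pc == start);
    return boundaries;
}
```

With BC set to 10, run_ldir executes the instruction ten times, so in
this model the processor reaches an instruction boundary after every
byte rather than once after the whole block.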
+++++++++++++++++++++++++++

From: Sherman@128.147.155.27 (Sherman Uitzetter)
Date: 24 Jun 92 15:57:13 GMT
Organization: NMRI Pittsburgh

In article <1992Jun23.212101.18059@bcrka451.bnr.ca>, stu5s11@bcrka280.bnr.ca writes:
>Based on what I've heard, microcode is not only on its way out,
>it's long gone. I believe that the 68040 uses microcode only in
>limited amounts, or not at all.

I'm quite sure this is incorrect. Microcode is definitely here to
stay - even on RISC processors. Today's processors are simply much too
complicated to be "hard-wired." I could see where nanocode might not
be used on RISC processors since, by design, the processor is simple
enough to do everything with microcode.

-Sherman.

+++++++++++++++++++++++++++

From: ephraim@think.com (Ephraim Vishniac)
Date: 24 Jun 92 17:53:18 GMT
Organization: Thinking Machines Corporation, Cambridge MA, USA

In article <1992Jun24.150024.21047@terminator.cc.umich.edu> potts@itl.itd.umich.edu (Paul Potts) writes:
>You're right, there is no "block move" instruction. There are good reasons
>not to implement it that have been mentioned. These instructions are nice to
>have for some purposes - I remember using the block move on the Z-80 processor,
>with much success - but they don't have a place in a processor which must
>respond quickly to interrupts. Instructions should generally always be
>"atomic" and uninterruptible (for example, it is critical that there is
>a test and set instruction which executes atomically in order to implement
>semaphores.) A single instruction shouldn't be interruptible (able?) because
>you would need excessive logic for state recovery.

The Z80's block instructions did not lock out interrupts, nor did they
"need excessive logic for state recovery." The reason is that these
instructions were actually single-instruction loops. Each execution of
the instruction only moved one byte.
So, interrupt latency remained short - an interrupt could perfectly
well be handled while a block instruction loop was in progress.

In the Z80, I think the block instructions (LDIR, LDDR, OTIR, INIR,
and others I've doubtless forgotten) were an excellent idea because a
limited address space made code density extremely important. Now that
address spaces are large and memory is cheap, things look different.

--
Ephraim Vishniac ephraim@think.com ThinkingCorp@applelink.apple.com
Thinking Machines Corporation / 245 First Street / Cambridge, MA 02142
One of the flaws in the anarchic bopper society was the ease with which
such crazed rumors could spread.

+++++++++++++++++++++++++++

From: des7f@fulton.seas.Virginia.EDU (David E. Sappington)
Organization: University of Virginia
Date: Wed, 24 Jun 1992 18:58:50 GMT

potts@itl.itd.umich.edu (Paul Potts) writes:
>You're right, there is no "block move" instruction. There are good reasons
>not to implement it that have been mentioned. These instructions are nice to
>have for some purposes - I remember using the block move on the Z-80 processor,
>with much success - but they don't have a place in a processor which must
>respond quickly to interrupts. Instructions should generally always be
>"atomic" and uninterruptible (for example, it is critical that there is
>a test and set instruction which executes atomically in order to implement
>semaphores.) A single instruction shouldn't be interruptible (able?) because
>you would need excessive logic for state recovery. If the user wants to block-
>move ten thousand bytes using a single instruction, you have effectively
>locked out interrupt handling for a lot of cycles, which has a great potential
>to cause very subtle error conditions (loss of network packets, serial line
>characters, "heartbeat" functionality, etc.)
>

Instructions on the 68010 and higher are not necessarily atomic.
The CPU deals with this by putting extra information on the stack in
response to an exception (aka interrupt), thus allowing the CPU to
"continue" the interrupted instruction rather than "restart" it. A
good example of all this occurs when an instruction requires something
from a virtual memory page that is not currently resident in RAM.

Dave Sappington                      des7f@virginia.edu
Institute for Parallel Computation   des7f@virginia.bitnet
University of Virginia

+++++++++++++++++++++++++++

From: potts@itl.itd.umich.edu (Paul Potts)
Date: 24 Jun 92 20:01:09 GMT
Organization: Instructional Technology Laboratory, University of Michigan

In article <1992Jun24.185850.21424@murdoch.acc.Virginia.EDU> des7f@fulton.seas.Virginia.EDU (David E. Sappington) writes:
>potts@itl.itd.umich.edu (Paul Potts) writes:
>
>Instructions on the 68010 and higher are not necessarily atomic. The CPU
>deals with this by putting extra information on the stack in response to
>an exception (aka interrupt) thus allowing the CPU to "continue" the
>interrupted instruction rather than "restart" it.

OK, I've been corrected by a number of people (some by mail and some
by news). I was wrong. I guess that comes from the fact that the last
processor I studied in detail was the 68000 (and I never studied the
Z80 in detail, just programmed it some). Sorry about that. I was aware
of the way the stack frame is used to handle exceptions on the 68000,
but apparently the mechanism has been extended since then.

Progress - go figure! :/

--
The essence of OOP: "With all this horse manure, I know there's got to be
a pony in here somewhere!"
Paul R. Potts, Software Designer --- potts@itl.itd.umich.edu <--- me!

+++++++++++++++++++++++++++

From: paul@taniwha.UUCP (Paul Campbell)
Date: 29 Jun 92 07:49:09 GMT
Organization: Taniwha Systems Design

In article <1992Jun22.223405.7704@dartvax.dartmouth.edu> Mark.R.Valence@dartmouth.edu (Mark R. Valence) writes:
>Well, there are three answers to this question.
>The first
>(correct/literal) answer is "no, it does not". See Philip Machanick's
>
>The second (Mac specific) answer is "but of course". Here is the code
>$A02E, commonly known as _BlockMove.
>
>The third answer (neat hack/historical relevance) is to use the MOVEM

and maybe a fourth - on most 68k variants a:

@1:     move.l  (a0)+, (a1)+
        dbra    d0, @1

loop fits in the cache and probably executes as fast as the microcode
to do a block-move instruction on many other CPUs - with any good
memory system it's going to be memory bandwidth bound.

        Paul

--
Paul Campbell    UUCP: ..!mtxinu!taniwha!paul    AppleLink: CAMPBELL.P

"'Potato', not 'Potatoe'"
        Bart Simpson - on the blackboard 6/25/92

+++++++++++++++++++++++++++

From: paul@taniwha.UUCP (Paul Campbell)
Date: 29 Jun 92 14:47:51 GMT
Organization: Taniwha Systems Design

In article <NEERI.92Jun23123136@iis.ethz.ch> neeri@iis.ethz.ch (Matthias Neeracher) writes:
>
>I agree with a previous poster that the following code is a good idea to use:
>
>       LEA     source, A0
>       LEA     dest, A1
>       MOVE.L  count, D0
>       _BlockMove
>
>Not only is this code more likely to be correct and easier to rectify if it's
>wrong (which I don't want to exclude :-), it is also *faster* for substantial

On an '040 Mac Apple have patched _BlockMove to flush the cache - this
takes a while and slows everything down thereafter. Until Apple
provides us with _BlockMoveNoCache you SHOULD write your own
_BlockMove and NOT use the one provided.
Just to be pedantic:

        lea     source, a0
        lea     dest, a1
        move.l  count, d0
        sub.l   #1, d0
        move.l  d0, d1
        swap    d1
        tst.w   d1
        beq.s   @3
@1:     move.l  #$ffff, d2
@2:     move.b  (a0)+, (a1)+
        dbra    d2, @2
        dbra    d1, @1

@3:     move.b  (a0)+, (a1)+
        dbra    d0, @3

should do it

        Paul

--
Paul Campbell    UUCP: ..!mtxinu!taniwha!paul    AppleLink: CAMPBELL.P

"'Potato', not 'Potatoe'"
        Bart Simpson - on the blackboard 6/25/92

+++++++++++++++++++++++++++

From: neeri@iis.ethz.ch (Matthias Neeracher)
Organization: Integrated Systems Laboratory, ETH, Zurich
Date: Wed, 1 Jul 1992 10:23:09 GMT

In article <1134@taniwha.UUCP> paul@taniwha.UUCP (Paul Campbell) writes:
>In article <NEERI.92Jun23123136@iis.ethz.ch> neeri@iis.ethz.ch (Matthias Neeracher) writes:
>>
>>I agree with a previous poster that the following code is a good idea to use:
>>
>>      LEA     source, A0
>>      LEA     dest, A1
>>      MOVE.L  count, D0
>>      _BlockMove
>>
>>Not only is this code more likely to be correct and easier to rectify if it's
>>wrong (which I don't want to exclude :-), it is also *faster* for substantial
>
>On an '040 Mac Apple have patched _BlockMove to flush the cache - this takes a
>while and slows everything down thereafter.

As far as I know, flushing the cache is relatively fast. You're right
that this slows things down a little, though. But that doesn't excuse
the following NONSENSE (sorry, there *is* no other word for it):

>Until Apple provides us with
>_BlockMoveNoCache you SHOULD write your own _BlockMove and NOT use the one
>provided.

WRONG! _BlockMove is still more correct than 95% of all memory move
routines I have seen posted here, and for substantial moves, it is
faster than 100% of them. Your code is proof of both of these claims:

>Just to be pedantic:
>
>       lea     source, a0
>       lea     dest, a1
>       move.l  count, d0
>       sub.l   #1, d0
>       move.l  d0, d1
>       swap    d1
>       tst.w   d1
>       beq.s   @3
>@1:    move.l  #$ffff, d2
>@2:    move.b  (a0)+, (a1)+
>       dbra    d2, @2
>       dbra    d1, @1
>
>@3:    move.b  (a0)+, (a1)+
>       dbra    d0, @3
>
> should do it

No, it doesn't.
- Your code is *incorrect*. Try it with count>=65537.
- Your code is *horrendously inefficient*. I don't have an '040 and a
  few weeks of spare time to measure the typical effects of a cache
  flush, but if you think they are worse than replacing _BlockMove
  (which I heard uses MOVE16 on an '040, provided the alignment is
  right) with a non-unrolled byte moving loop, you are most likely
  mistaken.

Matthias

PS: Excuse me if I sound too harsh, but arguments involving presumed
"performance gain" are an especially addictive trap for programmers to
fall into, so posting incorrect code in these cases is IMHO a lot
worse.

-----
Matthias Neeracher                                   neeri@iis.ethz.ch
"A lot of heavy-metal kids are just plain dumb." -- Chris Novoselic

+++++++++++++++++++++++++++

From: d88-jwa@dront.nada.kth.se (Jon Wätte)
Organization: Royal Institute of Technology, Stockholm, Sweden
Date: Wed, 1 Jul 1992 10:41:18 GMT

.UUCP> paul@taniwha.UUCP (Paul Campbell) writes:

   while and slows everything down thereafter. Until Apple provides us with
   _BlockMoveNoCache you SHOULD write your own _BlockMove and NOT use the one
   provided.

   Just to be pedantic:

Yes! I don't like this over-cautious approach Apple has towards
protecting people who do strange things to code (like the Segment
Loader :-)

nit picking:

   @2:     move.b  (a0)+, (a1)+
   @3:     move.b  (a0)+, (a1)+

More nits to pick:

But... these are byte moves! You could special-case for both source
and destination being even, or both being odd, and use longword moves.
Maybe even special-case for source & destination being the same modulo
16 and use MOVE16...

--
Jon Wätte, Svartmangatan 18, S-111 29 Stockholm, Sweden
"Difficult, obscure, incoherent and nonstandard does not imply more power."
- Andrew Kass in comp.sys.mac.hardware

+++++++++++++++++++++++++++

From: bruner@sp15.csrd.uiuc.edu (John Bruner)
Date: 1 Jul 92 14:59:39 GMT
Organization: CSRD, University of Illinois

In article <NEERI.92Jul1112309@iis.ethz.ch> neeri@iis.ethz.ch (Matthias Neeracher) writes:
> WRONG! _BlockMove is still more correct than 95% of all memory move routines I
> have seen posted here, and for substantial moves, it is faster than 100% of
> them. Your code is proof of both of these claims:

The most efficient way to move memory depends upon many factors; thus,
it is hard to get it right. The best sequence depends upon the
alignment of the operands, the size of the move, the processor,
whether cache invalidation is to be an issue, etc. This is one of the
reasons that it is expensive in terms of development time and
transistors to put it in the instruction set: all of the special cases
must be handled properly.

On all processors it will be most efficient if loads and stores are
aligned on natural boundaries. Otherwise the processor must use
multiple bus cycles to load or store unaligned data. If the processor
has a cache then some of this effect can be hidden, but in the case of
a write-through cache you still will pay for two write cycles if you
write longword data at an address that is not a multiple of four.

If possible, align both operands: e.g., if the source is N+n and the
destination is M+n, where N and M are multiples of 4, use something
like Duff's device or a small MOVE.B/DBRA loop to move the first (4-n)
bytes and then use MOVE.L to move the bulk of the bytes. (Fix up the
end, as necessary, with a few more byte moves.)

The easiest case is the 68010, where data moves from odd to even
addresses or vice versa pretty much have to be done with a MOVE.B/DBRA
loop and other moves are most efficient with MOVE.L/DBRA. Loop mode
eliminates the need for the instruction fetches, which overcomes any
potential efficiency you could otherwise gain by unrolling the loop.
Of course, no Macintosh products use the 68010, so this doesn't help
much.

Likewise, on the 68000, data moves from odd to even addresses or vice
versa call for a MOVE.B/DBRA loop. Copy longwords in an unrolled loop
to minimize the loop overhead. I don't recall the timing of DBRA on
the 68000, but it may even be the case that it doesn't pay to use it
and you might as well use a single MOVE.L/SUBQ/Bxx loop rather than a
nested MOVE.L/DBRA/DBRA one.

For the 680[23]0, align the destination at a multiple of four and then
copy longwords. Unroll the loop to minimize the loop overhead.

I don't know enough about the 68040 to come up with the best strategy.
MOVE16 probably will help quite a bit, so you probably want to align
the operands on a cache line boundary.

The boundary conditions for these cases get messy, particularly if you
want to copy in ascending or descending order in order to handle
overlapped moves correctly.

For all of these reasons, unless you have a very special case,
_BlockMove is likely to be a better alternative.

--
(Dr.) John Bruner, Deputy Director                  bruner@csrd.uiuc.edu
Center for Supercomputing Research & Development    (217) 244-4476 (voice)
University of Illinois at Urbana-Champaign          (217) 244-1351 (FAX)
305 Talbot Laboratory; 104 South Wright St.; Urbana, IL 61801

---------------------------

End of C.S.M.P. Digest
**********************
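
The alignment strategy John Bruner outlines above (byte moves up to a
longword boundary, longword moves for the bulk, byte moves for the
tail) can be sketched in C. This is a minimal illustration under
stated assumptions, not a replacement for _BlockMove: copy_aligned is
a hypothetical name, the two buffers are assumed not to overlap, a
longword is taken to be 4 bytes, and no unrolling or MOVE16-style
cache line handling is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Toolbox call): copy n bytes from src to
 * dst. Assumes the buffers do not overlap. */
static void copy_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (d != NULL && s != NULL));

    /* Longword moves only pay off when source and destination share
     * the same offset from a multiple of four. */
    if (((uintptr_t)d & 3u) == ((uintptr_t)s & 3u)) {
        /* Head: byte moves until both pointers are longword aligned. */
        while (n > 0 && ((uintptr_t)d & 3u) != 0) {
            *d++ = *s++;
            n--;
        }
        /* Bulk: one longword per iteration. The fixed-size memcpy
         * calls stand in for an aligned 32-bit load and store while
         * staying within C's aliasing rules. */
        while (n >= 4) {
            uint32_t w;
            memcpy(&w, s, 4);
            memcpy(d, &w, 4);
            d += 4;
            s += 4;
            n -= 4;
        }
    }

    /* Tail, or the whole block when the alignments differ. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

Handling overlapped moves would additionally require choosing between
ascending and descending order, which is one of the messy boundary
conditions mentioned in the post.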